22 research outputs found
Covariate-Dependent Clustering of Undirected Networks with Brain-Imaging Data
This article focuses on model-based clustering of subjects based on the shared relationships of subject-specific networks and covariates in scenarios when there are differences in the relationship between networks and covariates for different groups of
subjects. It is also of interest to identify the network nodes significantly associated with each covariate in each cluster of subjects. To address these methodological questions, we propose a novel nonparametric Bayesian mixture modeling framework with
an undirected network response and scalar predictors. The symmetric matrix coefficients corresponding to the scalar predictors of interest in each mixture component involve low-rankness and group sparsity within the low-rank structure. While the low-rank structure in the network coefficients adds parsimony and computational efficiency, the group sparsity within the low-rank structure enables drawing inference on network nodes and cells significantly associated with each scalar predictor. Being a principled
Bayesian mixture modeling framework, our approach allows model-based identification of the number of clusters, offers clustering uncertainty in terms of the co-clustering matrix and presents precise characterization of uncertainty in identifying network nodes
significantly related to a predictor in each cluster. Empirical results in various simulation scenarios illustrate substantial inferential gains of the proposed framework in comparison with competitors. Analysis of a real brain connectome dataset using the
proposed method provides interesting insights into the brain regions of interest (ROIs) significantly related to creative achievement in each cluster of subjects. NSF-DMS 2220840, NSF-DMS 221067
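The co-clustering matrix mentioned in this abstract is commonly estimated from MCMC output as the share of posterior draws in which two subjects receive the same cluster label. The sketch below assumes that definition; the function name and the toy label draws are hypothetical and are not taken from the article.

```python
import numpy as np

def co_clustering_matrix(label_draws):
    """Estimate P[i, j] = share of draws placing subjects i and j in the same cluster.

    label_draws: integer array of shape (n_draws, n_subjects).
    """
    draws = np.asarray(label_draws)
    # same[s, i, j] is True when subjects i and j share a label in draw s.
    same = draws[:, :, None] == draws[:, None, :]
    return same.mean(axis=0)

# Hypothetical labels for 4 subjects over 3 MCMC draws (illustrative only).
draws = np.array([
    [0, 0, 1, 1],
    [1, 1, 0, 2],
    [0, 0, 0, 1],
])
P = co_clustering_matrix(draws)
```

Because only label agreement within a draw is used, the result does not depend on how clusters are numbered from one draw to the next.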
Simultaneous Causal Inference and Probabilistic Record Linkage in Observational Studies with Covariates Spread Over Two Files
We consider observational studies with data spread over two files. One file includes the treatment, outcome, and some covariates measured on a set of individuals, and the other file includes additional covariates measured on a partially intersecting set of individuals. In the absence of direct identifiers, researchers typically estimate causal effects in two stages: construct a linked database with probabilistic record linkage, then apply causal estimators on the linked data.
This approach does not take advantage of relationships among the variables to improve the linkage quality. It also
does not propagate uncertainty from imperfect linkages to the causal inferences. We address these shortcomings via a Bayesian joint modeling framework for simultaneous causal inference and probabilistic record linkage.
The Markov chain Monte Carlo sampler generates multiple plausible linked data files as byproducts. We use these datasets for multiple imputation inferences with two causal estimators, one regression-adjusted and the other unadjusted, based on propensity score overlap weights.
Using simulations and data from the Italian Survey on Household Income and Wealth, we show that the joint model with both estimators can improve the accuracy of estimated treatment effects compared to analogous two-stage procedures.
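As a rough illustration of the two ingredients named in this abstract, the sketch below computes an unadjusted overlap-weighted difference in means on one linked dataset and pools estimates across several linked datasets with Rubin's combining rules. It assumes the propensity scores are already available (in practice they would be estimated), and the function names and toy numbers are hypothetical, not the authors' implementation.

```python
import numpy as np

def overlap_weighted_difference(y, t, e):
    """Unadjusted overlap-weighted difference in mean outcomes.

    y: outcomes, t: 0/1 treatment indicators, e: propensity scores P(t = 1 | x).
    Treated units get weight 1 - e and control units get weight e.
    """
    y, t, e = (np.asarray(a, dtype=float) for a in (y, t, e))
    w = np.where(t == 1, 1.0 - e, e)
    treated, control = t == 1, t == 0
    mu1 = np.sum(w[treated] * y[treated]) / np.sum(w[treated])
    mu0 = np.sum(w[control] * y[control]) / np.sum(w[control])
    return mu1 - mu0

def combine_rubin(estimates, variances):
    """Pool m completed-data estimates with Rubin's multiple imputation rules."""
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = len(q)
    q_bar = q.mean()            # pooled point estimate
    within = u.mean()           # average within-dataset variance
    between = q.var(ddof=1)     # variance of the estimates across datasets
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, total

# Toy inputs (illustrative only).
effect = overlap_weighted_difference(
    y=[3.0, 5.0, 1.0, 2.0], t=[1, 1, 0, 0], e=[0.5, 0.5, 0.5, 0.5]
)
pooled, total_var = combine_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-dataset term in the pooled variance is what carries linkage uncertainty into the final interval, which a single linked file cannot provide.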
"The fruits of independence": Satyajit Ray, Indian nationhood and the spectre of empire
Challenging the longstanding consensus that Satyajit Ray's work is largely free of ideological concerns and notable only for its humanistic richness, this article shows with reference to representations of British colonialism and Indian nationhood that Ray's films and stories are marked deeply and consistently by a distinctively Bengali variety of liberalism. Drawn from an ongoing biographical project, it commences with an overview of the nationalist milieu in which Ray grew up and emphasizes the preoccupation with colonialism and nationalism that marked his earliest unfilmed scripts. It then shows with case studies of Kanchanjangha (1962), Charulata (1964), First Class Kamra (First-Class Compartment, 1981), Pratidwandi (The Adversary, 1970), Shatranj ke Khilari (The Chess Players, 1977), Agantuk (The Stranger, 1991) and Robertsoner Ruby (Robertson's Ruby, 1992) how Ray's mature work continued to combine a strongly anti-colonial viewpoint with a shifting perspective on Indian nationhood and an unequivocal commitment to cultural cosmopolitanism. Analysing how Ray articulated his ideological positions through the quintessentially liberal device of complexly staged debates that were apparently free, but in fact closed by the scenarist/director on ideologically specific notes, this article concludes that Ray's reputation as an all-forgiving, ‘everybody-has-his-reasons’ humanist is based on simplistic or even tendentious readings of his work
On Bayesian Methods in Network Regression
There has been a growing interest during recent years in connectomics, which is the study of interconnections or networks within the human brain. This interest has been spurred by the development of new imaging technologies, which allow researchers to peer non-invasively into the human brain and obtain data on connections. Motivated by these datasets, this dissertation develops a novel class of Bayesian regression models which study the relationships between neuro-scientific phenotypes and brain connectome networks of individuals.
First, we introduce a novel approach that develops a regression framework of the brain network (represented in the form of a symmetric matrix) on a continuous phenotypic response. We propose a novel network shrinkage prior on the network predictor coefficient matrix. The proposed framework is able to identify nodes or functional regions in the brain network and interconnections between different regions, significantly related to the phenotypic response. To the best of our knowledge, our framework is the first principled Bayesian framework that enables identification of network nodes and edges significantly related to the response. The performance of the proposed model is evaluated with respect to a wide range of existing competitors available in the high dimensional frequentist and Bayesian literature using a variety of simulation studies. The proposed model identifies important brain regions and interconnections significantly associated with creativity for a group of subjects.
Next, we extend our model to build network classifiers when a brain connectome network along with a binary response is provided for a group of individuals. Here we develop a broader class of global-local network shrinkage priors which includes the novel prior distribution specified earlier as a special case. We specifically consider two different global-local network shrinkage priors from this class of priors and investigate them using simulation studies. In particular, we assess their performance in terms of network classification and identifying influential network nodes and edges for the purpose of classification. We also demonstrate superior performance of our proposed network classifiers over state-of-the-art high dimensional classification techniques. Another major contribution remains developing theoretical conditions to guarantee asymptotically consistent classification for the proposed framework. In particular, we derive conditions on the number of network nodes, sparsity in the network coefficient matrix as a function of the sample size to achieve asymptotically optimal classification. While theoretical results on high dimensional binary regression with ordinary shrinkage priors have emerged recently, developing theory for our network classifier model involves several additional challenges due to the complex nature of the global-local shrinkage prior developed here. The framework is used to classify individuals into high and low IQ groups based on their brain connectomes.
Notably, the work discussed in the last two paragraphs tacitly assumes that all nodes and edges have similar impact on a phenotype for every individual. In our next project, we study a brain connectome dataset where this assumption is violated. In fact, there is a relatively less developed literature in neuroscience that argues for different groups of individuals having shared relationships between brain networks and phenotypes, though this literature lacks a principled Bayesian approach that takes into account different relationships of nodes and edges with the response for different groups of individuals and facilitates clustering of individuals. Motivated by this problem and our dataset, we have developed a Bayesian network mixture regression model. Simulation studies and analysis of the brain connectome dataset demonstrate superior performance of the proposed approach over the approach described earlier. Simulation studies are also used to evaluate the performance of the proposed approach by varying the true and fitted number of clusters, size of the network and sample size.
For these projects, computationally efficient Bayesian sampling algorithms are developed to enable computations even for reasonably large networks in presence of moderately large sample size.
Bayesian Regression with Undirected Network Predictors with an Application to Brain Connectome Data
We propose a Bayesian approach to regression with a continuous scalar response and an undirected network predictor. Undirected network predictors are often expressed in terms of symmetric adjacency matrices, with rows and columns of the matrix representing the nodes, and zero entries signifying no association between two corresponding nodes. Network predictor matrices are typically vectorized prior to any analysis, thus failing to account for the important structural information in the network.
This results in poor inferential and predictive performance in presence of small sample sizes.
We propose a novel class of network shrinkage priors for the coefficient corresponding to the undirected network predictor. The proposed framework is devised to detect both nodes and edges in the network predictive of the response. Our framework is implemented using an efficient Markov Chain Monte Carlo algorithm. Empirical results in simulation studies illustrate strikingly superior inferential and predictive gains of the proposed framework in comparison with the ordinary high dimensional Bayesian shrinkage priors and penalized optimization schemes. We apply our method to a brain connectome dataset that contains information on brain networks along with a measure of creativity for multiple individuals. Here, interest lies in building a regression model of the creativity measure on the network predictor to identify important regions and connections in the brain strongly associated with creativity. To the best of our knowledge, our approach is the first principled Bayesian method that is able to detect scientifically interpretable regions and connections in the brain actively impacting the continuous response (creativity) in the presence of a small sample size. Non UBC. Unreviewed. Author affiliation: University of California at Santa Cruz. Graduat
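To make the vectorization step criticized in this abstract concrete, the sketch below flattens a small symmetric adjacency matrix into its upper-triangular edge vector. The matrix values and the helper name are hypothetical; the point is only that the flattened features no longer record which edges share a node unless the index map is carried along separately.

```python
import numpy as np

def vectorize_upper(A):
    """Stack the strictly upper-triangular entries of a symmetric matrix into a vector.

    Returns the edge-weight vector and the (i, j) node pair behind each entry.
    """
    A = np.asarray(A)
    rows, cols = np.triu_indices(A.shape[0], k=1)
    return A[rows, cols], list(zip(rows.tolist(), cols.tolist()))

# Hypothetical 4-node undirected network; zeros mean no association.
A = np.array([
    [0.0, 0.8, 0.0, 0.3],
    [0.8, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.0],
    [0.3, 0.0, 0.0, 0.0],
])
x, edges = vectorize_upper(A)

# A generic regression on x treats its V(V-1)/2 = 6 entries as unrelated features.
# Recovering the edges that touch node 0 requires the separate index map.
edges_touching_node_0 = [k for k, (i, j) in enumerate(edges) if 0 in (i, j)]
```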
Bayesian Covariate-Dependent Clustering of Undirected Networks with Brain-Imaging Data
This article focuses on model-based clustering of subjects based on the shared relationships of subject-specific networks and covariates in scenarios when there are differences in the relationship between networks and covariates for different groups of subjects. It is also of interest to identify the network nodes significantly associated with each covariate in each cluster of subjects.
To address these methodological questions, we propose a novel nonparametric Bayesian mixture modeling framework with an undirected network response and scalar predictors. The symmetric matrix coefficients corresponding to the scalar predictors of interest in each mixture component involve low-rankness and group sparsity within the low-rank structure. While the low-rank structure in the network coefficients adds parsimony and computational efficiency, the group sparsity within the low-rank structure enables drawing inference on network nodes and cells significantly associated with each scalar predictor. Our principled Bayesian framework allows precise characterization of uncertainty in identifying significant network nodes in each cluster. Empirical results in various simulation scenarios illustrate substantial inferential gains of the proposed framework in comparison with competitors. Analysis of a real brain connectome dataset using the proposed method provides interesting insights into the brain regions of interest (ROIs) significantly related to creative achievement in each cluster of subjects
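One way to picture a low-rank symmetric coefficient with group sparsity is the factorization B = U diag(lam) U^T, where zeroing an entire row of U removes the corresponding node from B. This parametrization is an assumption made for illustration and may differ from the one used in the article; the dimensions, values, and variable names below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
V, R = 6, 2                      # hypothetical number of nodes and rank

U = rng.normal(size=(V, R))      # node-specific latent factors
lam = np.array([1.0, -0.5])      # signs allow an indefinite symmetric matrix
U[[1, 4], :] = 0.0               # group sparsity: nodes 1 and 4 are switched off

# Symmetric coefficient matrix of rank at most R.
B = U @ np.diag(lam) @ U.T

# Nodes with a nonzero row in U are the ones that can be related to the predictor.
active_nodes = np.flatnonzero(np.abs(U).sum(axis=1) > 0)
```

Under this sketch, B has V x R free factor entries instead of V(V+1)/2 cell entries, and a node is irrelevant exactly when its whole row of U is zero, which is what makes node-level inference possible.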
Bayesian Causal Inference with Bipartite Record Linkage
In some scenarios, the observational data needed for causal inferences are spread over two data files. In particular, we consider scenarios where one file includes covariates and the treatment measured on a set of individuals, and a second file includes responses measured on another, partially overlapping set of individuals. In the absence of error-free direct identifiers like social security numbers, straightforward merging of separate files is not feasible, so that records must be linked using error-prone variables such as names, birth dates, and demographic characteristics. Typical practice in such situations generally follows a two-stage procedure: first link the two files using a probabilistic linkage technique, then make causal inferences with the linked dataset. This does not propagate uncertainty due to imperfect linkages to the causal inference, nor does it leverage relationships among the study variables to improve the quality of the linkages. We propose a joint model for simultaneous Bayesian inference on probabilistic linkage and causal effects that addresses these deficiencies. Using simulation studies and theoretical arguments, we show that the joint model can improve the accuracy of estimated treatment effects, as well as the record linkages, compared to the two-stage modeling option. We illustrate the joint model using a constructed causal study of the effects of debit card possession on household spending.